Established in December 2020 and located in Dallas, Texas (Lower Greenville), Son of a Butcher is a fast-food restaurant specializing in hamburgers that serves elevated versions of traditional burger sliders, shakes, and fries. The restaurant uses organic Wagyu beef and is known for breading its own chicken sliders and making its own black bean patties in-house for vegetarian customers.
Public opinion plays a pivotal role in shaping the success of a restaurant, with platforms like Yelp, TripAdvisor or Google Maps acting as influential conduits for customer feedback. The impact of online reviews on potential diners cannot be overstated, as individuals often turn to these platforms to gauge the experiences of others before deciding where to dine. The collective sentiments expressed through customer reviews provide invaluable insights into the quality of service, food, and overall dining experience. Positive reviews not only attract new customers but also contribute to the establishment’s positive reputation, fostering trust and loyalty. Conversely, negative reviews can dissuade potential patrons and have a lasting impact on the restaurant’s image. In an era where online presence is integral to business success, understanding and managing public opinion is essential for restaurants to thrive and maintain a competitive edge in the dynamic culinary landscape.
The primary objective of the present study is to gain insight into how public sentiment toward ‘Son of a Butcher’ has changed over time. To this end, data were extracted from the restaurant’s Yelp page (https://www.yelp.com/biz/son-of-a-butcher-dallas-3?sort_by=date_asc). The extraction was performed with R code, which can be found at the following link:
Structure of the Analysis:
The dataset, named “df_SOB,” comprises 13 variables pertinent to our analysis. These include “User names,” the customer names; “rating,” the grade assigned by each customer on a scale of 1 to 5 stars; and “Date” (month/day/year), which has been split into three separate variables (“weekday,” “month,” and “year”) to allow a finer-grained temporal analysis. Similarly, the “location” variable, which records the reviewer’s city and state, has been used to derive a “state” variable for geographical analysis. The variable “comment” contains the text of each customer review and provides qualitative insight into the customer’s experience. Finally, three variables (“useful,” “cool,” and “funny”) count the reactions of other Yelp users to each review, indicating how useful, cool, or funny they found it. This dataset forms the foundation for our exploration of public opinion about the establishment.
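The derivation of the temporal and geographical variables described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study; the function name `derive_fields`, the sample date, and the "City, ST" location layout are assumptions made for the example.

```python
from datetime import datetime


def derive_fields(date_str: str, location: str) -> dict:
    """Split a month/day/year date and a 'City, ST' location into analysis fields."""
    parsed = datetime.strptime(date_str, "%m/%d/%Y")
    # Assumption: the state is whatever follows the last comma in the location.
    state = location.rsplit(",", 1)[-1].strip()
    return {
        "weekday": parsed.strftime("%A"),  # full weekday name
        "month": parsed.strftime("%B"),    # full month name
        "year": parsed.year,
        "state": state,
    }


# Hypothetical review record, used only to illustrate the split.
example = derive_fields("12/19/2020", "Dallas, TX")
print(example)
```

Keeping the derived fields next to the original date makes it straightforward to group reviews by year, month, or weekday in the later sections.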
Figure 1. Rating mean over the years.
As noted in the introduction, the dataset spans from the restaurant’s opening in December 2020 to mid-January 2024. However, reviews from 2024 are excluded from the present study because too few are available. With a total of 328 customer reviews, the summary statistics (Table 1) show a global mean rating of 4.235 and a median of 5 stars. A breakdown by year reveals a decrease of roughly 10% in the mean rating from the opening year 2020 (4.7) to the following years (4.15 to 4.25) (Figure 1). However, the number of customer reviews varies considerably from year to year (Table 2). For 2020, the dataset only covers the inaugural month of December, which may skew the reviews toward more positive ratings because of the novelty factor. Table 2 shows that 2021 had the highest number of reviews (168), followed by a considerable decline in the subsequent years (83 and 53).
| rating | useful | funny | cool | rating_f |
|---|---|---|---|---|
| Min. :1.000 | Min. : 0.0000 | Min. : 0.0000 | Min. : 0.0000 | 1: 17 |
| 1st Qu.:4.000 | 1st Qu.: 0.0000 | 1st Qu.: 0.0000 | 1st Qu.: 0.0000 | 2: 13 |
| Median :5.000 | Median : 0.0000 | Median : 0.0000 | Median : 0.0000 | 3: 35 |
| Mean :4.235 | Mean : 0.8902 | Mean : 0.3476 | Mean : 0.7104 | 4: 74 |
| 3rd Qu.:5.000 | 3rd Qu.: 1.0000 | 3rd Qu.: 0.0000 | 3rd Qu.: 0.0000 | 5:189 |
| Max. :5.000 | Max. :50.0000 | Max. :31.0000 | Max. :48.0000 | |
Table 1. Summary statistics for ratings and user interactions.
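As a consistency check, the global mean and median can be recomputed from the star counts in the rating_f column of Table 1. The sketch below is a Python illustration (the study itself used R); the variable names are chosen for the example.

```python
from statistics import median

# Star counts taken from the rating_f column of Table 1.
rating_counts = {1: 17, 2: 13, 3: 35, 4: 74, 5: 189}

total_reviews = sum(rating_counts.values())
mean_rating = sum(stars * n for stars, n in rating_counts.items()) / total_reviews

# Expand the counts into one entry per review to take the median.
all_ratings = [stars for stars, n in rating_counts.items() for _ in range(n)]
median_rating = median(all_ratings)

print(total_reviews, round(mean_rating, 3), median_rating)
```

The counts sum to the 328 reviews reported in the text and reproduce the mean of 4.235 and the median of 5 stars.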
| Year | Total_reviews |
|---|---|
| 2020 | 23 |
| 2021 | 168 |
| 2022 | 83 |
| 2023 | 53 |
| 2024 | 1 |
Table 2. Total customer reviews per year.
Examining the distribution of ratings over the years reveals a consistent pattern in the mid and lower star categories (1-3 stars), showing minimal fluctuation. The proportion of customers assigning 1, 2, and 3 stars remains relatively low across the years, with only 4-6% opting for 1 star, 0-5% for 2 stars, and 4-12% for 3 stars.
Notably, data from the restaurant’s opening year, limited to December, indicates a predominant 87% of customers rating the establishment with 5 stars. However, this trend experiences a substantial decline in subsequent years (2021, 2022, and 2023), with only 57%, 58%, and 49% of customers, respectively, awarding 5 stars. Conversely, the trend reverses for 4-star ratings: in 2020, a mere 4% of customers rated with 4 stars, gradually increasing to 21%, 24%, and 32% in 2021, 2022, and 2023. Taken together, these findings suggest a notable reduction in the percentage of customers awarding 5 stars throughout the analyzed period. Excluding the year 2020, this reduction becomes particularly prominent from 2022 (58% with 5 stars) to 2023 (49% with 5 stars), while 4-star ratings witness an increment from 24% to 32% over the same period.
Figure 2. a) Rating Histogram and b) Rating distribution, over the years.
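The per-year star proportions discussed above amount to a grouped frequency table normalized within each year. A minimal Python sketch of that computation follows; the function `rating_shares` and the toy records are hypothetical and do not reproduce the study's data or its R code.

```python
from collections import Counter, defaultdict


def rating_shares(records):
    """Return {group: {stars: share}} for an iterable of (group, stars) pairs."""
    counts = defaultdict(Counter)
    for group, stars in records:
        counts[group][stars] += 1
    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        # Star levels with no reviews get a share of 0.0.
        shares[group] = {stars: counter[stars] / total for stars in range(1, 6)}
    return shares


# Hypothetical toy records of (year, stars); not taken from the dataset.
toy_records = [
    (2022, 5), (2022, 5), (2022, 4), (2022, 1),
    (2023, 5), (2023, 4), (2023, 4), (2023, 3),
]
shares = rating_shares(toy_records)
print(shares[2022][5], shares[2023][4])
```

The same function applies unchanged if the group is a month or a weekday instead of a year.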
To uncover patterns in rating trends, the distribution of rating stars, both in terms of frequencies and proportions, was visualized across different months. Analyzing the aggregated data across all years, we observe that March, April, and December stand out as months with higher volumes of customer reviews (Figure 3a). The highest ratings tend to occur in the months of January, June, and December. Conversely, the lowest ratings are more frequent in the months of May, April, and June (Figure 3b).
Figure 3. a) Rating Histogram and b) Rating distribution, over Months.
A closer examination of individual years, as illustrated in Figure 4a, reveals a concentration of reviews in the last month of 2020 (December) and the first half of 2021. In the subsequent two years (2022 and 2023), there is a noticeable decline in the number of customer reviews, although the pattern seen in 2021, with more reviews in the first half of the year, persists.
The proportion of customers giving 5 stars appears to be consistently higher in January (75-100%) and from May to July (50-100%) across the different years (Figure 4b). The share of 1- and 2-star ratings never exceeded 33.3%, and in several months there were none at all. This suggests that the restaurant is rated very well overall. It is noteworthy, however, that the months with the lowest ratings (aggregating 1 and 2 stars) occur in the last year analyzed (2023): August (33%), and January and November (25%).
Figure 4. a) Rating Histogram per month and b) Rating distribution per month, over the years.
To investigate the potential influence of weekdays on customer ratings, an analysis of the frequency and proportion distribution was conducted. Figure 5a illustrates that Thursdays through Sundays experienced higher volumes of customer reviews. Although no specific weekday stands out for the highest occurrence of 5-star ratings, weekends and Mondays exhibit increased frequencies. Additionally, these days show a higher percentage of lower-star ratings when aggregated across the years 2020, 2021, 2022, and 2023 (Figure 5a and b).
Figure 5. a) Rating Histogram per weekday and b) Rating distribution per weekday.
Breaking the data down by year shows a consistent pattern in which more reviews are submitted during weekends. However, this trend is specific to the years 2021 and 2022. In 2023, Thursdays and Fridays exhibited higher frequencies of customer ratings, as illustrated in Figure 6. Notably, in 2022 those same days (Thursdays and Fridays) received higher star ratings, with 50% of customers rating the establishment with either 4 or 5 stars. In the subsequent year, however, this proportion decreased to 20-33% (4 and 5 stars), while the combined proportion of 1 and 2 stars increased to 33-40% for these same days (Thursdays and Fridays). On the other hand, in 2023 the percentage of lower ratings decreased not only for weekends but also for Mondays and Tuesdays, compared to 2022 (Figure 6b).
Figure 6. a) Rating Histogram per weekday and b) Rating proportion distribution per weekday.
In summary, the share of 5-star ratings has declined markedly, from 87% in the opening month to 49% in 2023. Relative to 2022 (58%), this is a decrease of about 15% in the share of customers giving a 5-star rating. Conversely, the share of 4-star ratings increased by about 33% relative to the previous year (24% to 32%). Notably, the percentage of lower star ratings (1, 2, and 3) did not change substantially over the recent years.
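The changes quoted in this summary are relative changes in the yearly shares, not differences in percentage points. The short Python sketch below reproduces them from the 2022 and 2023 shares reported earlier (58% and 49% for 5 stars, 24% and 32% for 4 stars); the helper name `relative_change` is an assumption for illustration.

```python
def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Shares of 5-star and 4-star ratings in 2022 and 2023, as reported in the text.
five_star_change = relative_change(58, 49)
four_star_change = relative_change(24, 32)

# The same changes expressed in percentage points.
five_star_points = 49 - 58
four_star_points = 32 - 24

print(round(five_star_change, 1), round(four_star_change, 1))
print(five_star_points, four_star_points)
```

In percentage points, the 5-star share fell by 9 and the 4-star share rose by 8, which is why the two shifts largely offset each other.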
To determine actionable steps for improving the overall rating trend, the focus will be on the last two years: 2022 and 2023. Examining the data by month, the standout months in 2023, with 50% to 100% of customers giving 5 stars, are June, October, March, December, and January. All of these months except January improved on their 2022 ratings; January instead saw a decrease in the percentage of 5-star ratings.
In January, April, August, and November of 2023, there was a noticeable rise in the percentage of lower star ratings (1 and 2 stars). In particular, April had a high number of negative reviews in both years. In contrast, May, June, and September saw a marked improvement in customer ratings, with no customers leaving a negative review.
In 2023, Mondays and Tuesdays exhibited a noteworthy enhancement in customer ratings, with 50% of clients awarding either 4 or 5 stars, marking a significant increase from the previous year. Conversely, a reversal in this trend was observed for Thursdays and Fridays. Notably, these days see a higher frequency of customer reviews on Yelp, suggesting potential areas for improvement. Lastly, Saturdays and Sundays also displayed improvements compared to 2022.
Figure 7. Sentiment Analysis Histogram.
Figure 8. Sentiment Analysis over the years.
To gain insights into customer sentiments expressed in Yelp comments, a sentiment analysis was conducted using the General Inquirer system developed by social psychologists at Harvard University, which relies on the Harvard IV Dictionary.
In this initial analysis, each comment was assigned a sentiment score (SentimentGI). Negative scores denote negative sentiments, positive scores indicate positive sentiments, and values near zero represent neutral sentiments.
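The idea behind a dictionary-based score can be sketched as follows. This Python illustration assumes the score is the number of positive words minus the number of negative words, divided by the total word count, which is a common convention for such scores. The two word lists are tiny hypothetical stand-ins for the Harvard IV dictionary, and the sketch skips preprocessing steps such as stemming and stop-word removal.

```python
import re

# Hypothetical miniature word lists; the real dictionary is far larger.
POSITIVE = {"good", "great", "friendly"}
NEGATIVE = {"bad", "slow", "cold"}


def sentiment_score(comment: str) -> float:
    """Return (positive words - negative words) / total words for a comment."""
    words = re.findall(r"[a-z']+", comment.lower())
    if not words:
        return 0.0  # an empty comment is treated as neutral
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


positive_example = sentiment_score("Great sliders and friendly staff")
negative_example = sentiment_score("Cold fries and slow service")
print(positive_example, negative_example)
```

Under this convention the score lies between -1 and 1, and a comment with as many positive as negative words scores zero.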
Overall, the sentiment analysis suggests that the predominant sentiment expressed in the comments is positive, followed by a smaller fraction expressing negative sentiment, while neutral sentiment is negligible (see Figure 7). This pattern remains consistent across the years, as illustrated in Figure 8, which also shows the peak in customer posts in 2021 and the gradual decrease in the following years.
To explore specific keywords that commonly appear in positive or negative reviews, we analyzed the top 10 words associated with each sentiment category using the “loughran” lexicon. As previously shown, positive sentiment prevails, with frequently occurring words such as ‘good,’ ‘great,’ and ‘friendly.’ This suggests that, in addition to food quality, customers value friendly service, an aspect that likely contributes to the high ratings.
Figure 9. Top 10 words contribution to sentiment.
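The word-level tally behind a plot of this kind can be sketched as follows: each word found in a sentiment lexicon is counted under its category, and the most frequent words per category are kept. This is a Python illustration only; the small `LEXICON` mapping and the toy comments are hypothetical and do not reproduce the “loughran” lexicon or the study's data.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon mapping words to a sentiment category.
LEXICON = {
    "good": "positive",
    "great": "positive",
    "friendly": "positive",
    "bad": "negative",
    "slow": "negative",
}


def top_words(comments, n=10):
    """Return the n most frequent lexicon words for each sentiment category."""
    counts = {"positive": Counter(), "negative": Counter()}
    for comment in comments:
        for word in re.findall(r"[a-z']+", comment.lower()):
            label = LEXICON.get(word)
            if label is not None:
                counts[label][word] += 1
    return {label: counter.most_common(n) for label, counter in counts.items()}


# Hypothetical toy comments; not taken from the Yelp data.
toy_comments = [
    "Great burger, great fries",
    "Good food but slow service",
    "Friendly staff, good shakes",
]
result = top_words(toy_comments, n=2)
print(result)
```

Words that are absent from the lexicon are ignored, so the result depends heavily on which lexicon is chosen.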
As described in the introduction, Yelp’s website provides users with the option to engage with customer reviews by selecting one of three reactions: ‘useful,’ ‘cool,’ or ‘funny.’ To explore potential variations in user interaction based on review ratings, we plotted the counts of these interaction variables against the ratings. Figure 10 illustrates that while the majority of comments receive no user feedback, there appears to be a slight positive trend where user interaction increases with higher ratings, primarily through ‘useful’ and ‘cool’ reactions.
When analyzing the relationship between comments’ sentiment scores (SentimentGI) and user interactions (useful, cool, and funny), no discernible pattern emerged. This suggests that Yelp users do not exhibit any specific reaction patterns in response to the language used in reviews (data not shown).
Figure 10. Relationship between User Interactions with Customer Reviews and Ratings.
As previously noted, Son of a Butcher is located in Texas, so it is unsurprising that the majority of reviews originate from this state (approximately 240 of the 328 posts). California and Florida are the second and third most frequent states, respectively (see Figure 11).
To explore potential correlations between geographical location and average ratings, ratings were plotted against the different locations. However, no clear pattern was observed (see Figure 12).
Figure 11. Customer reviews per state.
Figure 12. User interactions with customer reviews.